AITopics | last-iterate convergence rate

Collaborating Authors

last-iterate convergence rate

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

d10f451f973846d20dae8674c493fa3c-Paper-Conference.pdf

Neural Information Processing SystemsFeb-18-2026, 06:02:50 GMT

algorithm, artificial intelligence, machine learning, (20 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts (0.04)
(3 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Leisure & Entertainment > Games (0.68)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.67)

Add feedback

Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games with Bandit Feedback

Neural Information Processing SystemsFeb-14-2026, 07:55:07 GMT

Our paper takes the first step to remove such assumptions.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California (0.14)
North America > United States > Virginia (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Workflow (0.48)

Industry: Leisure & Entertainment (0.45)

Technology:

Information Technology > Game Theory (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)

Add feedback

db2d2001f63e83214b08948b459f69f0-Paper-Conference.pdf

Neural Information Processing SystemsFeb-12-2026, 07:08:39 GMT

algorithm, convergence rate, potential function, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Russia (0.04)
(2 more...)

Genre: Research Report (0.46)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Game Theory (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

eea5d933e9dce59c7dd0f6532f9ea81b-Paper.pdf

Neural Information Processing SystemsFeb-11-2026, 00:47:10 GMT

algorithm, arxiv, convergence, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
(8 more...)

Industry: Leisure & Entertainment > Games (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Game Theory (0.69)

Add feedback

29c861b02308a57aec18990f0dfe3777-Paper-Conference.pdf

Neural Information Processing SystemsFeb-9-2026, 23:29:22 GMT

algorithm, assumption 2, regularizer, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.92)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Add feedback

Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games with Bandit Feedback

Neural Information Processing SystemsDec-26-2025, 01:25:36 GMT

We revisit the problem of learning in two-player zero-sum Markov games, focusing on developing an algorithm that is *uncoupled*, *convergent*, and *rational*, with non-asymptotic convergence rates to Nash equilibrium. We start from the case of stateless matrix game with bandit feedback as a warm-up, showing an $\tilde{\mathcal{O}}(t^{-\frac{1}{8}})$ last-iterate convergence rate. To the best of our knowledge, this is the first result that obtains finite last-iterate convergence rate given access to only bandit feedback. We extend our result to the case of irreducible Markov games, providing a last-iterate convergence rate of $\tilde{\mathcal{O}}(t^{-\frac{1}{9+\varepsilon}})$ for any $\varepsilon> 0$. Finally, we study Markov games without any assumptions on the dynamics, and show a *path convergence* rate, a new notion of convergence we defined, of $\tilde{\mathcal{O}}(t^{-\frac{1}{10}})$. Our algorithm removes the synchronization and prior knowledge requirement of Wei et al. (2021), which pursued the same goals as us for irreducible Markov games. Our algorithm is related to Chen et al. (2021) and Cen et al. (2021)'s and also builds on the entropy regularization technique. However, we remove their requirement of communications on the entropy values, making our algorithm entirely uncoupled.

last-iterate convergence rate, two-player zero-sum markov game, uncoupled and convergent learning, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.39)

Add feedback

Tight last-iterate convergence rates for no-regret learning in multi-player games

Neural Information Processing SystemsDec-24-2025, 20:52:05 GMT

We study the question of obtaining last-iterate convergence rates for no-regret learning algorithms in multi-player games. We show that the optimistic gradient (OG) algorithm with a constant step-size, which is no-regret, achieves a last-iterate rate of O(1/ T) with respect to the gap function in smooth monotone games. This result addresses a question of Mertikopoulos & Zhou (2018), who asked whether extra-gradient approaches (such as OG) can be applied to achieve improved guarantees in the multi-agent learning setting. The proof of our upper bound uses a new technique centered around an adaptive choice of potential function at each iteration. We also show that the O(1/ T) rate is tight for all p-SCLI algorithms, which includes OG as a special case. As a byproduct of our lower bound analysis we additionally present a proof of a conjecture of Arjevani et al. (2015) which is more direct than previous approaches.

last-iterate convergence rate, no-regret learning, tight last-iterate convergence rate, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.78)

Add feedback

From Average-Iterate to Last-Iterate Convergence in Games: A Reduction and Its Applications

Cai, Yang, Luo, Haipeng, Wei, Chen-Yu, Zheng, Weiqiang

arXiv.org Artificial IntelligenceNov-11-2025

The convergence of online learning algorithms in games under self-play is a fundamental question in game theory and machine learning. Among various notions of convergence, last-iterate convergence is particularly desirable, as it reflects the actual decisions made by the learners and captures the day-to-day behavior of the learning dynamics. While many algorithms are known to converge in the average-iterate, achieving last-iterate convergence typically requires considerably more effort in both the design and the analysis of the algorithm. Somewhat surprisingly, we show in this paper that for a large family of games, there exists a simple black-box reduction that transforms the average iterates of an uncoupled learning dynamics into the last iterates of a new uncoupled learning dynamics, thus also providing a reduction from last-iterate convergence to average-iterate convergence. Our reduction applies to games where each player's utility is linear in both their own strategy and the joint strategy of all opponents. This family includes two-player bimatrix games and generalizations such as multi-player polymatrix games. By applying our reduction to the Optimistic Multiplicative Weights Update algorithm, we obtain new state-of-the-art last-iterate convergence rates for uncoupled learning dynamics in multi-player zero-sum polymatrix games: (1) an $O(\frac{\log d}{T})$ last-iterate convergence rate under gradient feedback, representing an exponential improvement in the dependence on the dimension $d$ (i.e., the maximum number of actions available to either player); and (2) an $\widetilde{O}(d^{\frac{1}{5}} T^{-\frac{1}{5}})$ last-iterate convergence rate under bandit feedback, improving upon the previous best rates of $\widetilde{O}(\sqrt{d} T^{-\frac{1}{8}})$ and $\widetilde{O}(\sqrt{d} T^{-\frac{1}{6}})$.

artificial intelligence, game theory, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2506.03464

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)

Add feedback

Last Iterate Analyses of FTRL in Stochasitc Bandits

Zhan, Jingxin, Han, Yuze, Zhang, Zhihua

arXiv.org Artificial IntelligenceOct-28-2025

The convergence analysis of online learning algorithms is central to machine learning theory, where last-iterate convergence is particularly important, as it captures the learner's actual decisions and describes the evolution of the learning process over time. However, in multi-armed bandits, most existing algorithmic analyses mainly focus on the order of regret, while the last-iterate (simple regret) convergence rate remains less explored -- especially for the widely studied Follow-the-Regularized-Leader (FTRL) algorithms. Recently, a growing line of work has established the Best-of-Both-Worlds (BOBW) property of FTRL algorithms in bandit problems, showing in particular that they achieve logarithmic regret in stochastic bandits. Nevertheless, their last-iterate convergence rate has not yet been studied. Intuitively, logarithmic regret should correspond to a $t^{-1}$ last-iterate convergence rate. This paper partially confirms this intuition through theoretical analysis, showing that the Bregman divergence, defined by the regular function $Ψ(p)=-4\sum_{i=1}^{d}\sqrt{p_i}$ associated with the BOBW FTRL algorithm $1/2$-Tsallis-INF (arXiv:1807.07623), between the point mass on the optimal arm and the probability distribution over the arm set obtained at iteration $t$, decays at a rate of $t^{-1/2}$.

artificial intelligence, data mining, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2510.22819

Country: Europe > United Kingdom (0.28)

Genre: Research Report (0.64)

Industry: Education (0.34)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)

Add feedback

Last-Iterate Convergence for Generalized Frank-Wolfe in Monotone Variational Inequalities

Neural Information Processing SystemsOct-10-2025, 17:21:15 GMT

We study the convergence behavior of a generalized Frank-Wolfe algorithm in constrained (stochastic) monotone variational inequality (MVI) problems. In recent years, there have been numerous efforts to design algorithms for solving constrained MVI problems due to their connections with optimization, machine learning, and equilibrium computation in games.

algorithm, inequality, mvi problem, (16 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts (0.04)
(3 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Leisure & Entertainment > Games (0.68)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback